Journal of Theoretical and Applied Information Technology 
15th May 2024. Vol.102. No 9 




© Little Lion Scientific 


ISSN: 1992-8645 




E-ISSN: 1817-3195 


EVALUATING THE PERFORMANCE OF XGBOOST AND 
GRADIENT BOOST MODELS WITH FEATURE 
EXTRACTION IN FMCG DEMAND FORECASTING: A 
FEATURE-ENRICHED COMPARATIVE STUDY 


MURARI THEJOVATHI¹, DR M.V.P. CHANDRA SEKHARA RAO²


¹Department of Computer Science and Engineering, Acharya Nagarjuna University, Guntur, Andhra
Pradesh, India
²Department of Computer Science and Engineering, RVR&JC College of Engineering, Guntur, Andhra
Pradesh, India
E-mail: ¹kkutheju@gmail.com, ²manukondach@gmail.com


ABSTRACT 

In this paper, we propose the inclusion of Gradient Boost, another ensemble technique, to broaden the 
scope and potentially improve forecasting accuracy. This research looks at how XGBoost and Gradient 
Boost, two powerful ensemble learning methods, can be used to predict demand in the FMCG sector. The 
suggested method also includes advanced feature extraction techniques to make the model work better. The 
current method uses XGBoost, a well-known and effective gradient-boosting technique that is fast and easy 
to scale. The suggested method includes gradient boost, which is another ensemble technique, as well as 
feature extraction techniques that help find and use the dataset's most important information. The research 
aims to compare the performance of XGBoost and Gradient Boost models in the context of demand 
forecasting for Fast-Moving Consumer Goods (FMCG) data. Additionally, the study incorporates feature 
extraction methods to enhance the models' predictive capabilities. We test both models thoroughly using 
FMCG data to see how well they work in terms of accuracy, reliability, and how quickly they can be run. To 
find the factors that have the most influence on demand prediction, feature extraction techniques like 
Principal Component Analysis (PCA) and Recursive Feature Elimination (RFE) are used. The study's results 
tell us a lot about how well the XGBoost and Gradient Boost models work for predicting demand in the 
FMCG sector. Using feature extraction methods is also meant to find hidden patterns in the data, which will 
help supply chain professionals in the FMCG business make more accurate predictions and better decisions. 
Researchers can use the results of this study to help them choose the best method for their own demand 
forecasting needs. This will improve operational efficiency and cut costs in the FMCG supply chain. 

1. INTRODUCTION 

Demand forecasting is an important part of supply
chain management in the Fast-Moving Consumer
Goods (FMCG) business because it helps keep
inventory levels low while ensuring that customers
can get the products they want. XGBoost
(eXtreme Gradient Boosting) is a robust and
scalable implementation of the gradient boosting
framework. It handles structured, tabular data well
and is widely used across machine-learning tasks.
Its notable features include regularization, parallel
computation, and built-in handling of missing
values. XGBoost has become popular because of
its efficiency, accuracy, and ability to handle large
datasets.

Gradient boosting is an ensemble learning technique
that builds a predictive model by incrementally
adding weak learners, often decision trees. It
minimizes a loss function via gradient descent, with
each new learner correcting the errors of the
previous ones. Gradient boosting is known for its
robust predictive power and adaptability, making it
valuable for regression and classification tasks.
Feature extraction is an essential process in machine 
learning where useful features are chosen from raw 
data to enhance model performance. It aids in 
reducing dimensionality, improving interpretability, 
and alleviating the curse of dimensionality. By 
extracting significant features, the model may 
prioritize pertinent information, resulting in 
improved generalization and enhanced efficiency. 
Principal Component Analysis (PCA) and Recursive 
Feature Elimination (RFE) are frequently used 
techniques for feature extraction. 
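
The boosting idea described above, adding weak learners that each correct the errors left by the previous ones, can be illustrated with a minimal sketch. It assumes squared-error loss, a single input feature, and one-split regression stumps as weak learners; the data values and the names `fit_stump`, `gradient_boost`, and `predict` are hypothetical and are not taken from the study's code.

```python
def fit_stump(xs, residuals):
    """Find the single split on x that best fits the residuals (least squares)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # dropping the largest value keeps both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boosting for squared error: each stump is fitted to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For the loss 0.5 * (y - p)^2 the negative gradient is the residual y - p.
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                 for x, p in zip(xs, preds)]
    return {"base": base, "stumps": stumps, "learning_rate": learning_rate}


def predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    total = model["base"]
    for threshold, left_mean, right_mean in model["stumps"]:
        total += model["learning_rate"] * (left_mean if x <= threshold else right_mean)
    return total


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Hypothetical weekly demand (units) against a single illustrative feature.
weeks = [1, 2, 3, 4, 5, 6, 7, 8]
demand = [120.0, 135.0, 128.0, 160.0, 175.0, 170.0, 210.0, 230.0]

model = gradient_boost(weeks, demand)
baseline_mse = mse(demand, [model["base"]] * len(demand))
boosted_mse = mse(demand, [predict(model, w) for w in weeks])
print(baseline_mse, boosted_mse)
```

With squared error, each round can only lower or keep the training error, which is why the boosted error falls below that of the constant baseline. Library implementations add further controls such as tree depth limits, subsampling, and regularization.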
• Analyze FMCG sales data using statistical and
machine learning techniques to identify
consumer demand trends in urban and rural areas.

• Develop forecasting models such as XGBoost and
Gradient Boost, with data pre-processing to
support accuracy, and evaluate the models'
effectiveness through performance metrics such
as accuracy, precision, recall, reliability, and
computational efficiency across different
scenarios.

• Compare the performance of XGBoost and
Gradient Boost models specifically in the context
of demand forecasting for FMCG data, and
investigate advanced feature extraction
techniques that enhance model performance by
identifying and using the most important
information in the dataset.

• Propose the inclusion of Gradient Boost,
another ensemble technique, and emphasize the
use of feature extraction techniques to improve
the model's performance.

• Highlight the significance of the study's results
for supply chain professionals in the FMCG
business. The improved accuracy and hidden
pattern discovery are expected to assist in making
more accurate predictions and better decisions.


2. LITERATURE SURVEY 

The study found that Gradient Boosted
Decision Trees (GBDT) with Feature Interaction
improve customer demand prediction for organizations.
These strategies capture complex variable
interactions to improve demand estimates.

The paper's multidimensional demand forecasting
method incorporates historical time-series data,
product attributes, customer preferences, and
environmental influences like weather and the
economy. This holistic approach lets businesses
construct models that go beyond prediction to
comprehend high-demand item dynamics in
different situations.

The study demonstrates that demand forecasting affects
more than retail. It influences
inventory management, personnel scheduling, and 
supply chain optimization, making it a critical tool 
for operational efficiency. 

Better forecasting and consumer-driven operations 
are recommended in the research. In the competitive 
retail industry, these components must work together 
to improve customer service and profitability. The 
new paradigm drives companies to prioritize 
customers and compete with current technology. 
The study emphasizes the need for precise demand
forecasting so that enterprises can accommodate customer
preferences and make informed operational
decisions. According to the paper, demand
forecasting helps companies adapt swiftly to market
and consumer developments.

A highlight is that GBDT with Feature Interaction is
applied in both rural and urban environments. This
adaptability shows the technology's potential for
precise forecasting at a global scale.

In the dynamic retail industry, advanced demand 
forecasting methods are necessary for economic 
sustainability, operational optimization, and data- 
driven decision-making. The paper's comprehensive 
approach, which analyses various variables and 
strategic concerns, makes demand forecasting 
essential to modern firm strategy. Modern 
technologies and methods are vital, suggesting retail 
innovation and adaptation. 


Researcher(s) | Methodology Employed | Evaluation Criteria | Data Utilized | Model Accuracy


Error Metrics: MAE, MSE, RMSE, R²


P. M. Pardalos, R. J. 
Hyndman, Y. 
Khandakar 


Time Series 
Forecasting, forecast 
Package for R 


• Time-series data for forecasting
• Historical time-series data
• Features such as trend and seasonality
• External factors affecting time series

Not explicitly mentioned in the information


O. I. Oriekhoe, B. I.
Ashiwaju, K. C.
Ihemereze, U. Ikwue


Review of Big Data 
in FMCG Supply 
Chains 


Sales Forecasting 
Model for New- 
Released Products 


S. Hwang, G. Yoon, 
E. Baek, B.-K. Jeon 


• Efficiency and Optimization Metrics
• Cost-Benefit Analysis
• Scalability and Adaptability


Error Metrics: MAE, 
MSE, RMSE 


• Big data sources such as RFID, IoT, and sensors
• Market-specific data for the African FMCG sector
• Strategies employed by U.S. companies in FMCG
supply chains

Not mentioned

Sales data for new-released and 
short-term mobile phones 
T. Huang, R. Fildes, 
D. Soopramanien 


Competitive Information in FMCG Sales Forecasting

90.34%


Error Metrics: MAE, 
MSE, RMSE 


FMCG retail product sales data 


J. Henzel, M. Sikora

Not explicitly mentioned


90.25% 


95.56% 


Gradient Boosting Application in FMCG Retail

Error Metrics: MAE, MSE, RMSE, Precision, Recall, F1-Score
Area Under the Curve (AUC-ROC)
Feature Importance
Cross-Validation Techniques

• Performance indicators data for promotions
• Historical promotion efficiency data
• FMCG retail sales data
• Product-specific attributes and features
• External factors influencing promotion outcomes
• Competitor information
• Demographic data of target customer segments

Historical sales and purchase data 

for different product categories 

Pricing information, promotional 

strategies, and marketing efforts 

Consumer behavior data 


S. Gelper et al. Identifying Demand 
Effects in Product 


Categories 


Regression Analysis 
Cross-Category 
Demand Effects 
Causal Inference 
Methods 

Statistical Significance 
Tests 

Time-Series Analysis 
Error Metrics: MAE, 
MSE, RMSE, Precision, 
Recall, F1-Score 

Area Under the Curve 
(AUC-ROC) 

Feature Importance 
Cross-Validation 
Techniques 

Error Metrics: MAE, 
MSE, RMSE 


M. L. Demircan, E. 
Merdan 


Order Prediction 
Methodology with 
Fuzzy Sets 


Vendor-Managed Inventory 
System in FMCG Sector 


Applying Deep 
Learning to Forecast 
Demand 


Vietnamese FMCG Company 


93.570 


3. METHODOLOGY

We propose the inclusion of Gradient Boost,
another ensemble technique, to broaden the scope
and potentially improve forecasting accuracy.

1. Examining XGBoost and Gradient Boost:
The study analyzes the performance of XGBoost and
Gradient Boost models, focusing on demand
forecasting for FMCG data.
2. Methods for Extracting Features:
The study employs feature extraction techniques like
Principal Component Analysis (PCA) and Recursive
Feature Elimination (RFE) to pinpoint significant
factors in predicting demand.
3. Comprehensive Testing and Assessment:
The study performs thorough testing of both models with
FMCG data, assessing accuracy, reliability, and
computational efficiency.

The study recognizes the importance of demand forecasting
in the FMCG industry for maintaining ideal inventory
levels and meeting customer expectations. It concentrates
on XGBoost and Gradient Boost, which are
well-known as effective ensemble learning methods,
to forecast demand in the FMCG industry. It
introduces feature extraction techniques to boost
model performance by identifying and leveraging
the most crucial information in the dataset. XGBoost,
a widely recognized and powerful gradient-boosting
method that is both fast and scalable, serves as the
standard approach.

The research focuses on the impact of feature 
extraction on FMCG demand forecasting, providing 
insights for decision-making, and improving supply 
chain management. It highlights the importance of 
understanding which model performs better in this 
sector, as it can enhance operational efficiency, 
reduce excess inventory, and improve resource 
allocation. The findings may also generalize,
offering insights into the performance of XGBoost
and Gradient Boost models in other forecasting
domains.
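
To make the PCA step named in this section concrete, the following is a minimal sketch of extracting the first principal component from a small feature matrix. It is an illustration under stated assumptions, not the study's implementation: it uses power iteration on the sample covariance matrix in plain Python, the function name `first_principal_component` and the feature values are hypothetical, and features are only centered, not standardized. RFE is not shown.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, explained variance ratio, projected scores)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered features.
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: repeated multiplication converges to the top eigenvector,
    # provided the start vector is not orthogonal to it.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * cov[a][b] * v[b] for a in range(d) for b in range(d))
    explained = eigenvalue / sum(cov[a][a] for a in range(d))
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, explained, scores


# Hypothetical features: the second column is twice the first, the third is small noise.
features = [
    [1.0, 2.0, 0.2],
    [2.0, 4.0, -0.1],
    [3.0, 6.0, 0.0],
    [4.0, 8.0, 0.1],
    [5.0, 10.0, -0.2],
]
direction, explained_ratio, scores = first_principal_component(features)
print(direction, explained_ratio)
```

Because two of the three hypothetical columns are perfectly correlated, a single component captures almost all of the variance, which is the situation in which PCA reduces dimensionality with little information loss.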


4. RESULTS 

Common metrics for binary classification are
Accuracy, Precision, Recall, and F1-score.

Accuracy measures the overall correctness of the
model.

$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP: True Positives, TN: True Negatives, FP:
False Positives, FN: False Negatives.

Precision (P) or Positive Predictive Value (PPV).
Precision measures the accuracy of positive
predictions.

$$P = \frac{TP}{TP + FP}$$

Recall (R) or Sensitivity or True Positive Rate
(TPR). Recall measures the ability of the model to
capture all positive instances.

$$R = \frac{TP}{TP + FN}$$

[Table row: GradientBoostML | 197.0485439 | 0.999718585 | 71.01156065 | 0.999957165]


F1-score is the harmonic mean of precision and
recall.

$$F_1 = 2 \times \frac{P \times R}{P + R}$$

The ROC curve represents the trade-off between
sensitivity and specificity. AUC-ROC measures the
area under this curve: $\mathrm{AUC\text{-}ROC} \in [0, 1]$.
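
One way to compute AUC-ROC without drawing the curve relies on its equivalent interpretation as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below takes that pairwise approach; the labels, scores, and the function name `auc_roc` are hypothetical.

```python
def auc_roc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and model scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = auc_roc(labels, scores)
print(auc)
```

Here three of the four positive-negative pairs are ordered correctly, so the value is 0.75. The pairwise form is quadratic in the number of examples; rank-based implementations are preferable for large datasets.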

For XGBoost and Gradient Boost, the overall
objective function can be represented as:

$$\text{Objective} = \sum_{i} L(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$$

where $L(y_i, \hat{y}_i)$ is the loss function for
binary classification,
$\Omega(f_k)$ is the regularization term, and
$f_k$ represents the k-th tree in the ensemble of $K$ trees.
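
The objective above can be evaluated numerically once the loss and the regularization term are fixed. The sketch below assumes logistic loss on predicted probabilities and the penalty form used by XGBoost, Ω(f) = γT + ½λΣw², where T is the number of leaves and w are the leaf weights. The text does not specify either choice, so both are assumptions, as are the function names and the input values.

```python
import math


def logistic_loss(y, p):
    """Binary cross-entropy for one example; p is the predicted probability."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def tree_penalty(leaf_weights, gamma, lam):
    """Assumed penalty: gamma * (number of leaves) + 0.5 * lambda * sum of squared weights."""
    return gamma * len(leaf_weights) + 0.5 * lam * sum(w * w for w in leaf_weights)


def objective(labels, probabilities, trees, gamma=1.0, lam=1.0):
    """Sum of per-example losses plus the penalty of every tree."""
    data_term = sum(logistic_loss(y, p) for y, p in zip(labels, probabilities))
    penalty_term = sum(tree_penalty(t, gamma, lam) for t in trees)
    return data_term + penalty_term


# Hypothetical inputs: two examples and one tree with two leaves.
value = objective([1, 0], [0.5, 0.5], trees=[[0.5, -0.5]])
print(value)
```

The data term here is 2 ln 2 and the penalty is 2 + 0.25, which shows how a larger or more heavily weighted tree raises the objective even when the fit is unchanged.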
Encode Categorical Features
Categorical features such as 'WH_capacity_size' and
'zone' are encoded using Label Encoding to
transform them into a numeric format suitable for the
machine learning model:
categorical_features = ['WH_capacity_size', 'zone']
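
A minimal sketch of what label encoding does to these two columns is given below. It assumes the common convention of assigning integer codes in sorted order of the category names (the convention scikit-learn's `LabelEncoder` follows); the category values, the row data, and the helper `fit_label_encoder` are hypothetical and written in plain Python rather than with the library the study may have used.

```python
def fit_label_encoder(values):
    """Map each distinct category to an integer code, in sorted order."""
    return {category: code for code, category in enumerate(sorted(set(values)))}


# Hypothetical rows; only the two categorical columns are shown.
rows = [
    {"WH_capacity_size": "Small", "zone": "North"},
    {"WH_capacity_size": "Large", "zone": "West"},
    {"WH_capacity_size": "Mid", "zone": "North"},
    {"WH_capacity_size": "Large", "zone": "South"},
]
categorical_features = ["WH_capacity_size", "zone"]

encoders = {}
for feature in categorical_features:
    # Build the mapping from the original values, then overwrite them with codes.
    encoders[feature] = fit_label_encoder(row[feature] for row in rows)
    for row in rows:
        row[feature] = encoders[feature][row[feature]]

print(encoders)
print(rows)
```

The integer codes carry an arbitrary order, which tree-based models tolerate because they split on thresholds, although the order has no meaning of its own.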
Next, missing values are handled. The code
calculates medians for numeric columns only and
fills missing values in these columns with their
respective medians. The common imputation statistics are:

$$\text{Mean} = \frac{\sum \text{non-missing values}}{\text{number of non-missing values}}$$

Median = middle value of the sorted non-missing values

Mode = most frequently occurring non-missing value
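
The median-imputation step can be sketched as follows. This is a plain-Python illustration under stated assumptions: missing entries are represented by `None`, the column names other than `zone` are hypothetical, and a column counts as numeric when all of its present values are numbers.

```python
from statistics import median

# Hypothetical columns; None marks a missing entry.
columns = {
    "num_refill_req": [3, None, 5, 8],
    "workers_num": [29.0, 31.0, None, 24.0],
    "zone": ["North", None, "West", "South"],  # non-numeric, left untouched
}


def is_numeric_column(values):
    """True when the column has at least one value and all present values are numbers."""
    present = [v for v in values if v is not None]
    return bool(present) and all(isinstance(v, (int, float)) for v in present)


medians = {}
for name, values in columns.items():
    if is_numeric_column(values):
        # The median is computed from the non-missing values only.
        medians[name] = median(v for v in values if v is not None)
        columns[name] = [medians[name] if v is None else v for v in values]

print(medians)
print(columns)
```

The median is less sensitive to extreme values than the mean, which is a common reason for preferring it when filling gaps in skewed numeric columns.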
Features that are thought to influence the target
variable ('product_wg_ton') are selected, and the
datasets are split into training and testing sets.
XGBoost regressor models and Gradient Boost
regression models for rural and urban data are
initialized and trained.
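
The rural/urban separation and the train/test split can be sketched as below. The column name `Location_type`, the 80/20 split ratio, the random seed, and the generated records are all assumptions made for illustration; only 'product_wg_ton' is named in the text. Model fitting is left out, since each training split would simply be passed to the chosen regressor.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical records: every third one is urban, the rest are rural.
records = [
    {"id": i,
     "Location_type": "Rural" if i % 3 else "Urban",
     "product_wg_ton": 10000 + 500 * i}
    for i in range(30)
]

splits = {}
for area in ("Rural", "Urban"):
    subset = [r for r in records if r["Location_type"] == area]
    splits[area] = train_test_split(subset)  # (train rows, test rows)

for area, (train, test) in splits.items():
    print(area, len(train), len(test))
```

Splitting each subset separately keeps the rural and urban test sets disjoint from their own training data, so the per-area error metrics are measured on unseen records.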




Table 1: Comparison of performance metrics of
XGBoost and Gradient Boost models

Performance metrics for both Models
The models are evaluated using RMSE (Root Mean
Squared Error) and R² (coefficient of determination)
metrics to assess their accuracy and explanatory
power.
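
For reference, the two metrics can be computed as in the sketch below, using the usual definitions: RMSE is the square root of the mean squared error, and R² is one minus the ratio of the residual sum of squares to the total sum of squares. The actual and predicted values are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical values.
actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 8.0]
print(rmse(actual, predicted), r_squared(actual, predicted))
```

RMSE is expressed in the units of the target, so it is only comparable between models evaluated on the same data, whereas R² is unitless.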

Table 2: Performance metrics of XGBoost and
Gradient Boost models for rural and urban

Scatter plots are generated to visually compare 
actual and predicted values for both rural and urban 
models. 


[Figure panels: Rural: Actual vs. Predicted; Urban: Actual vs. Predicted]
Figure 1: Visualization: Actual vs. Predicted Values
(XgBoost ML)

[Figure panels: Rural: Actual vs. Predicted; Urban: Actual vs. Predicted; axis ticks from 10000 to 50000]
Figure 2: Visualization: Actual vs. Predicted Values
(GBDT ML)

Histograms are plotted to analyze the distribution of
prediction errors (differences between actual and
predicted values).


[Figure panels: Rural: Error Distribution; Urban: Error Distribution]
Figure 3: Visualization: Error Distribution
(XgBoost ML)

[Figure panels: Rural: Error Distribution; Urban: Error Distribution]
Figure 4: Visualization: Error Distribution
(GBDT ML)

This comprehensive approach not only builds and 
evaluates models for different subsets of the data 
(rural vs. urban) but also provides insights through 
visual analysis, helping to understand model 
performance and error characteristics in depth. 
The RMSE values for the Gradient Boosting and
XGBoost models, obtained through the ensemble
technique, are shown in Table 3.


Model | RMSE for Rural Model | RMSE for Urban Model
XgBoostML | 1556.04524 | 1616.62442
GradientBoostML | 197.0485439 | 71.01156065
Table 3: RMSE Values for both Models

[Figure: Model Comparison: RMSE Values, with bars for Gradient Boosting and XGBoost]



Figure 5: Comparison between the proposed and 
existing models 


Lower RMSE values indicate better predictive 
accuracy, and in this case, the Gradient Boosting 
model demonstrated a lower RMSE compared to 
XGBoost, suggesting superior performance in terms 
of minimizing prediction errors. 


5. CONCLUSION 


This research proposes using Gradient Boost, an 
ensemble methodology, in addition to the known 
XGBoost method, to enhance demand forecasting in 
the fast-moving consumer goods (FMCG) industry. 
The study utilizes sophisticated feature extraction 
techniques, including principal component analysis 
(PCA) and recursive feature elimination (RFE), to 
improve model performance. We conducted a 
thorough assessment of the XGBoost and Gradient 
Boost models using FMCG data to evaluate their 
accuracy, dependability, and efficiency. The study's 
findings illuminate the efficacy of XGBoost and 
Gradient Boost in forecasting demand, offering 
significant insights for supply chain experts in the 
FMCG sector. By using gradient boost and feature 
extraction techniques, the goal is to reveal concealed 
patterns in the data, providing a deeper insight into 
the key elements influencing demand forecasting. 
This intelligence is essential for making well-
informed choices and enhancing operational
efficiency in the FMCG supply chain.

The research adds to the current literature by 
providing a comparative analysis of two influential 
ensemble learning algorithms in the context of 
demand forecasting for FMCG data. The study 
indicates that by combining XGBoost with Gradient 
Boost and feature extraction approaches, there is 
potential to expand the range and enhance the 
accuracy of forecasting. 

The study's results are a significant resource for 
academics and practitioners, helping them make 
educated decisions on the most appropriate 
methodologies for their demand forecasting 
requirements. Supply chain experts in the FMCG 
sector may improve their forecasting capacities and 
decision-making by using the insights obtained from 
this research. This is anticipated to enhance 
operational efficiency and decrease costs in the 
FMCG supply chain. 


6. FUTURE ENHANCEMENT 


In the future, this research lays the foundation for 
several promising avenues in the realm of demand 
forecasting for the fast-moving consumer goods 
(FMCG) industry. Firstly, the integration of cutting- 
edge deep learning techniques, such as recurrent 
neural networks (RNNs) or attention mechanisms, 
could be explored to further refine the model's 
capacity to capture intricate temporal dependencies. 
Additionally, the dynamic nature of market 
conditions could be addressed by developing 
adaptive feature selection strategies or even 
implementing real-time forecasting capabilities. The 
continued pursuit of enhanced model explainability
remains crucial, with the potential exploration of 
advanced techniques in explainable AI to ensure 
transparency and foster trust among stakeholders. 
Furthermore, assessing the cross-industry 
applicability of the ensemble learning approach and 
conducting continuous evaluations and updates to 
the model will contribute to its longevity and 
relevance. Collaboration with industry experts, the 
exploration of additional external data streams, and 
the development of user-friendly interfaces can 
collectively propel this research towards practical 
implementation, offering sustainable benefits for
supply chain experts in the FMCG sector.
Ultimately, the future scope involves a
comprehensive and evolving approach that
integrates emerging technologies, addresses real-
world challenges, and aligns with the evolving
landscape of demand forecasting in the FMCG
industry.